NSF PAR Search | NSF Public Access Repository


Learning to Edit Visual Programs with Self-Supervision

Jones, R Kenny; Zhang, Renhao; Ganeshan, Aditya; Ritchie, Daniel (December 2024, NeurIPS '24: 2024 Conference on Neural Information Processing Systems)

We design a system that learns how to edit visual programs. Our edit network consumes a complete input program and a visual target. From this input, we task our network with predicting a local edit operation that could be applied to the input program to improve its similarity to the target. In order to apply this scheme for domains that lack program annotations, we develop a self-supervised learning approach that integrates this edit network into a bootstrapped finetuning loop along with a network that predicts entire programs in one-shot. Our joint finetuning scheme, when coupled with an inference procedure that initializes a population from the one-shot model and evolves members of this population with the edit network, helps to infer more accurate visual programs. Over multiple domains, we experimentally compare our method against the alternative of using only the one-shot model, and find that even under equal search-time budgets, our editing-based paradigm provides significant advantages.
Full Text Available
ParSEL: Parameterized Shape Editing with Language

https://doi.org/10.1145/3687922

Ganeshan, Aditya; Huang, Ryan; Xu, Xianghao; Jones, R Kenny; Ritchie, Daniel (December 2024, ACM Transactions on Graphics)

The ability to edit 3D assets with natural language presents a compelling paradigm to aid in the democratization of 3D content creation. However, while natural language is often effective at communicating general intent, it is poorly suited for specifying exact manipulation. To address this gap, we introduce ParSEL, a system that enables controllable editing of high-quality 3D assets with natural language. Given a segmented 3D mesh and an editing request, ParSEL produces a parameterized editing program. Adjusting these parameters allows users to explore shape variations with exact control over the magnitude of the edits. To infer editing programs which align with an input edit request, we leverage the abilities of large-language models (LLMs). However, we find that although LLMs excel at identifying the initial edit operations, they often fail to infer complete editing programs, resulting in outputs that violate shape semantics. To overcome this issue, we introduce Analytical Edit Propagation (AEP), an algorithm which extends a seed edit with additional operations until a complete editing program has been formed. Unlike prior methods, AEP searches for analytical editing operations compatible with a range of possible user edits through the integration of computer algebra systems for geometric analysis. Experimentally, we demonstrate ParSEL's effectiveness in enabling controllable editing of 3D objects through natural language requests over alternative system designs.
Full Text Available
Learning to infer generative template programs for visual concepts

Jones, R Kenny; Chaudhuri, Siddhartha; Ritchie, Daniel (July 2024, ICML'24: Proceedings of the 41st International Conference on Machine Learning)

People grasp flexible visual concepts from a few examples. We explore a neurosymbolic system that learns how to infer programs that capture visual concepts in a domain-general fashion. We introduce Template Programs: programmatic expressions from a domain-specific language that specify structural and parametric patterns common to an input concept. Our framework supports multiple concept-related tasks, including few-shot generation and co-segmentation through parsing. We develop a learning paradigm that allows us to train networks that infer Template Programs directly from visual datasets that contain concept groupings. We run experiments across multiple visual domains: 2D layouts, Omniglot characters, and 3D shapes. We find that our method outperforms task-specific alternatives, and performs competitively against domain-specific approaches for the limited domains where they exist.
Full Text Available
Improving Unsupervised Visual Program Inference with Code Rewriting Families

https://doi.org/10.1109/ICCV51070.2023.01447

Ganeshan, Aditya; Jones, R. Kenny; Ritchie, Daniel (October 2023, Proceedings of the International Conference on Computer Vision)

Full Text Available
ShapeCoder: Discovering Abstractions for Visual Programs from Unstructured Primitives

https://doi.org/10.1145/3592416

Jones, R. Kenny; Guerrero, Paul; Mitra, Niloy J.; Ritchie, Daniel (August 2023, ACM Transactions on Graphics)

We introduce ShapeCoder, the first system capable of taking a dataset of shapes, represented with unstructured primitives, and jointly discovering (i) useful abstraction functions and (ii) programs that use these abstractions to explain the input shapes. The discovered abstractions capture common patterns (both structural and parametric) across a dataset, so that programs rewritten with these abstractions are more compact, and suppress spurious degrees of freedom. ShapeCoder improves upon previous abstraction discovery methods, finding better abstractions, for more complex inputs, under less stringent input assumptions. This is principally made possible by two methodological advancements: (a) a shape-to-program recognition network that learns to solve sub-problems and (b) the use of e-graphs, augmented with a conditional rewrite scheme, to determine when abstractions with complex parametric expressions can be applied, in a tractable manner. We evaluate ShapeCoder on multiple datasets of 3D shapes, where primitive decompositions are either parsed from manual annotations or produced by an unsupervised cuboid abstraction method. In all domains, ShapeCoder discovers a library of abstractions that captures high-level relationships, removes extraneous degrees of freedom, and achieves better dataset compression compared with alternative approaches. Finally, we investigate how programs rewritten to use discovered abstractions prove useful for downstream tasks.
Full Text Available
SHRED: 3D Shape Region Decomposition with Learned Local Operations

https://doi.org/10.1145/3550454.3555440

Jones, R. Kenny; Habib, Aalia; Ritchie, Daniel (December 2022, ACM Transactions on Graphics)

We present SHRED, a method for 3D SHape REgion Decomposition. SHRED takes a 3D point cloud as input and uses learned local operations to produce a segmentation that approximates fine-grained part instances. We endow SHRED with three decomposition operations: splitting regions, fixing the boundaries between regions, and merging regions together. Modules are trained independently and locally, allowing SHRED to generate high-quality segmentations for categories not seen during training. We train and evaluate SHRED with fine-grained segmentations from PartNet; using its merge-threshold hyperparameter, we show that SHRED produces segmentations that better respect ground-truth annotations compared with baseline methods, at any desired decomposition granularity. Finally, we demonstrate that SHRED is useful for downstream applications, outperforming all baselines on zero-shot fine-grained part instance segmentation and few-shot fine-grained semantic segmentation when combined with methods that learn to label shape regions.
Full Text Available
Neurosymbolic Models for Computer Graphics

https://doi.org/10.1111/cgf.14775

Ritchie, Daniel; Guerrero, Paul; Jones, R. Kenny; Mitra, Niloy J.; Schulz, Adriana; Willis, Karl D.; Wu, Jiajun (May 2023, Computer Graphics Forum)

Full Text Available
PLAD: Learning to Infer Shape Programs with Pseudo-Labels and Approximate Distributions

https://doi.org/10.1109/CVPR52688.2022.00964

Jones, R. Kenny; Walke, Homer; Ritchie, Daniel (June 2022, IEEE CVPR 2022)

Full Text Available
Neurosymbolic Models for Computer Graphics

Ritchie, Daniel; Guerrero, Paul; Jones, R. Kenny; Mitra, Niloy; Schulz, Adriana; Willis, Karl D.; Wu, Jiajun (January 2023, Eurographics)

Full Text Available
The Neurally-Guided Shape Parser: Grammar-based Labeling of 3D Shape Regions with Approximate Inference

https://doi.org/10.1109/CVPR52688.2022.01132

Jones, R. Kenny; Habib, Aalia; Hanocka, Rana; Ritchie, Daniel (June 2022, IEEE CVPR 2022)

Full Text Available
